AITopics | skolkovo institute

Collaborating Authors

skolkovo institute

SafeHumanoid: VLM-RAG-driven Control of Upper Body Impedance for Humanoid Robot

Mahmoud, Yara, Sam, Jeffrin, Khang, Nguyen, Fernando, Marcelino, Tokmurziyev, Issatay, Cabrera, Miguel Altamirano, Khan, Muhammad Haris, Lykov, Artem, Tsetserukou, Dzmitry

arXiv.org Artificial Intelligence, Dec-1-2025

Safe and trustworthy Human-Robot Interaction (HRI) requires robots not only to complete tasks but also to regulate impedance and speed according to scene context and human proximity. We present SafeHumanoid, an egocentric vision pipeline that links Vision Language Models (VLMs) with Retrieval-Augmented Generation (RAG) to schedule impedance and velocity parameters for a humanoid robot. Egocentric frames are processed by a structured VLM prompt, embedded and matched against a curated database of validated scenarios, and mapped to joint-level impedance commands via inverse kinematics. We evaluate the system on tabletop manipulation tasks with and without human presence, including wiping, object handovers, and liquid pouring. The results show that the pipeline adapts stiffness, damping, and speed profiles in a context-aware manner, maintaining task success while improving safety. Although current inference latency (up to 1.4 s) limits responsiveness in highly dynamic settings, SafeHumanoid demonstrates that semantic grounding of impedance control is a viable path toward safer, standard-compliant humanoid collaboration.
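
As a rough illustration of what context-dependent impedance scheduling means, the sketch below applies a textbook single-joint spring-damper law with gains chosen by a scene label. It is not the SafeHumanoid pipeline: the `PROFILES` table, the labels, the gain values, and both function names are hypothetical, and the retrieval and inverse-kinematics stages described in the abstract are omitted.

```python
# Hypothetical context -> impedance profile table; the labels and gain
# values are illustrative assumptions, not taken from the paper.
PROFILES = {
    "no_human": {"stiffness": 60.0, "damping": 6.0},
    "human_near": {"stiffness": 15.0, "damping": 3.0},
}


def select_profile(context_label):
    """Return the profile for a scene label, falling back to the most
    compliant profile when the label is unknown."""
    return PROFILES.get(context_label, PROFILES["human_near"])


def impedance_torque(profile, q_des, q, dq_des, dq):
    """Single-joint spring-damper law:
    tau = K * (q_des - q) + D * (dq_des - dq)."""
    position_term = profile["stiffness"] * (q_des - q)
    velocity_term = profile["damping"] * (dq_des - dq)
    return position_term + velocity_term


if __name__ == "__main__":
    # Same tracking error, two contexts: the compliant profile commands
    # a smaller corrective torque.
    for label in ("no_human", "human_near"):
        tau = impedance_torque(select_profile(label), 1.0, 0.5, 0.0, 0.0)
        print(label, tau)
```

Lower stiffness near a person means the same position error produces a gentler corrective torque, which is the qualitative behavior the abstract describes.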

artificial intelligence, large language model, natural language, (17 more...)

2511.233

Country: Europe > Russia (0.17)

Genre:

Research Report (0.70)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)

VLH: Vision-Language-Haptics Foundation Model

Fuentes, Luis Francisco Moreno, Khan, Muhammad Haris, Cabrera, Miguel Altamirano, Serpiva, Valerii, Iarchuk, Dmitri, Mahmoud, Yara, Tokmurziyev, Issatay, Tsetserukou, Dzmitry

arXiv.org Artificial Intelligence, Aug-5-2025

We present VLH, a novel Visual-Language-Haptic Foundation Model that unifies perception, language, and tactile feedback in aerial robotics and virtual reality. Unlike prior work that treats haptics as a secondary, reactive channel, VLH synthesizes mid-air force and vibration cues as a direct consequence of contextual visual understanding and natural language commands. Our platform comprises an 8-inch quadcopter equipped with dual inverse five-bar linkage arrays for localized haptic actuation, an egocentric VR camera, and an exocentric top-down view. Visual inputs and language instructions are processed by a fine-tuned OpenVLA backbone - adapted via LoRA on a bespoke dataset of 450 multimodal scenarios - to output a 7-dimensional action vector (Vx, Vy, Vz, Hx, Hy, Hz, Hv). INT8 quantization and a high-performance server ensure real-time operation at 4-5 Hz. In human-robot interaction experiments (90 flights), VLH achieved a 56.7% success rate for target acquisition (mean reach time 21.3 s, pose error 0.24 m) and 100% accuracy in texture discrimination. Generalization tests yielded 70.0% (visual), 54.4% (motion), 40.0% (physical), and 35.0% (semantic) performance on novel tasks. These results demonstrate VLH's ability to co-evolve haptic feedback with perceptual reasoning and intent, advancing expressive, immersive human-robot interactions.

artificial intelligence, haptic feedback, human computer interaction, (13 more...)

2508.01361

Genre: Research Report (1.00)

Industry:

Information Technology (0.68)
Education (0.68)
Transportation > Air (0.47)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.91)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.68)

Machine Learning-Driven Compensation for Non-Ideal Channels in AWG-Based FBG Interrogator

Kazakov, Ivan A., Kulichenko, Iana V., Kovalev, Egor E., Treskova, Angelina A., Barma, Daria D., Malakhov, Kirill M., Oseledets, Ivan V., Shipulin, Arkady V.

arXiv.org Artificial Intelligence, Jul-17-2025

We present an experimental study of a fiber Bragg grating (FBG) interrogator based on a silicon oxynitride (SiON) photonic integrated arrayed waveguide grating (AWG). While AWG-based interrogators are compact and scalable, their practical performance is limited by non-ideal spectral responses. To address this, two calibration strategies within a 2.4 nm spectral region were compared: (1) a segmented analytical model based on a sigmoid fitting function, and (2) a machine learning (ML)-based regression model. The analytical method achieves a root mean square error (RMSE) of 7.11 pm within the calibrated range, while the ML approach based on exponential regression achieves 3.17 pm. Moreover, the ML model demonstrates generalization across an extended 2.9 nm wavelength span, maintaining sub-5 pm accuracy without re-fitting. Residual and error distribution analyses further illustrate the trade-offs between the two approaches. ML-based calibration provides a robust, data-driven alternative to analytical methods, delivering enhanced accuracy for non-ideal channel responses, reduced manual calibration effort, and improved scalability across diverse FBG sensor configurations.
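
For readers unfamiliar with the metric, the following is a minimal sketch of how an RMSE in picometres can be computed from predicted and reference wavelengths given in nanometres. The function name `rmse_pm` and the sample values are assumptions made for illustration; they are not data or code from the paper.

```python
import math


def rmse_pm(predicted_nm, reference_nm):
    """Root mean square error between two wavelength lists (in nm),
    returned in picometres."""
    if len(predicted_nm) != len(reference_nm) or not predicted_nm:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted_nm, reference_nm)]
    mean_squared = sum(squared) / len(squared)
    return math.sqrt(mean_squared) * 1000.0  # convert nm to pm


if __name__ == "__main__":
    # Made-up wavelength offsets in nm, for illustration only.
    print(rmse_pm([0.003, 0.000], [0.000, 0.004]))
```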

artificial intelligence, interrogator, machine learning, (18 more...)

doi: 10.1109/LSENS.2025.3585057

2506.13575

Country: North America > United States > California > Yolo County > Davis (0.14)

Genre:

Research Report > New Finding (0.34)
Research Report > Experimental Study (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)

Echo: An Open-Source, Low-Cost Teleoperation System with Force Feedback for Dataset Collection in Robot Learning

Bazhenov, Artem, Satsevich, Sergei, Egorov, Sergei, Khabibullin, Farit, Tsetserukou, Dzmitry

arXiv.org Artificial Intelligence, Apr-11-2025

In this article, we propose Echo, a novel joint-matching teleoperation system designed to enhance the collection of datasets for manual and bimanual tasks. Our system is specifically tailored for controlling the UR manipulator and features a custom controller with force feedback and adjustable sensitivity modes, enabling precise and intuitive operation. Additionally, Echo integrates a user-friendly dataset recording interface, simplifying the process of collecting high-quality training data for imitation learning. The system is designed to be reliable, cost-effective, and easily reproducible, making it an accessible tool for researchers, laboratories, and startups passionate about advancing robotics through imitation learning. Although the current implementation focuses on the UR manipulator, the Echo architecture is reconfigurable and can be adapted to other manipulators and humanoid systems. We demonstrate the effectiveness of Echo through a series of experiments, showcasing its ability to perform complex bimanual tasks and its potential to accelerate research in the field. We provide assembly instructions, a hardware description, and code at https://eterwait.github.io/Echo/.

artificial intelligence, arxiv preprint arxiv, machine learning, (12 more...)

2504.07939

Country: Europe > Russia (0.16)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Quantization of Large Language Models with an Overdetermined Basis

Merkulov, Daniil, Cherniuk, Daria, Rudikov, Alexander, Oseledets, Ivan, Muravleva, Ekaterina, Mikhalev, Aleksandr, Kashin, Boris

arXiv.org Artificial Intelligence, Apr-15-2024

In this paper, we introduce an algorithm for data quantization based on the principles of Kashin representation. This approach hinges on decomposing any given vector, matrix, or tensor into two factors. The first factor maintains a small infinity norm, while the second exhibits a similarly constrained norm when multiplied by an orthogonal matrix. Surprisingly, the entries of factors after decomposition are well-concentrated around several peaks, which allows us to efficiently replace them with corresponding centroids for quantization purposes. We study the theoretical properties of the proposed approach and rigorously evaluate our compression algorithm in the context of next-word prediction tasks and on a set of downstream tasks for text classification. Our findings demonstrate that Kashin Quantization achieves competitive or superior quality in model performance while ensuring data compression, marking a significant advancement in the field of data quantization.
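
The last step the abstract mentions, replacing entries clustered around a few peaks with centroids, can be sketched as a nearest-centroid lookup. This is only that final step under simplifying assumptions: the Kashin decomposition itself is not implemented here, the centroids are hand-picked rather than learned, and the function name `quantize_to_centroids` is hypothetical.

```python
def quantize_to_centroids(values, centroids):
    """Replace each value with its nearest centroid.

    Returns (indices, dequantized): the index of the chosen centroid for
    each value, and the centroid values themselves.
    """
    indices = [
        min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
        for v in values
    ]
    dequantized = [centroids[k] for k in indices]
    return indices, dequantized


if __name__ == "__main__":
    # Hand-picked centroids standing in for the peaks of the entry
    # distribution; real centroids would be estimated from the factors.
    idx, approx = quantize_to_centroids([-0.9, -0.1, 0.05, 1.2], [-1.0, 0.0, 1.0])
    print(idx, approx)
```

Only the small integer indices and the short centroid table need to be stored, which is where the compression comes from in any centroid-based scheme.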

algorithm, matrix, quantization, (13 more...)

2404.09737

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.05)
Asia > Russia (0.05)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

DogSurf: Quadruped Robot Capable of GRU-based Surface Recognition for Blind Person Navigation

Bazhenov, Artem, Berman, Vladimir, Satsevich, Sergei, Shalopanova, Olga, Cabrera, Miguel Altamirano, Lykov, Artem, Tsetserukou, Dzmitry

arXiv.org Artificial Intelligence, Feb-5-2024

This paper introduces DogSurf, a new approach to using quadruped robots to help visually impaired people navigate in the real world. The presented method allows the quadruped robot to detect slippery surfaces and to use audio and haptic feedback to inform the user when to stop. A state-of-the-art GRU-based neural network architecture with a mean accuracy of 99.925% was proposed for the task of multiclass surface classification for quadruped robots. A dataset was collected on a Unitree Go1 Edu robot. The dataset and code have been posted to the public domain.

dogsurf, robot, slippery surface, (13 more...)

doi: 10.1145/3610978.3640606

2402.03156

Country:

North America > United States > Colorado > Boulder County > Boulder (0.15)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.06)
Asia > Russia (0.06)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.71)

Technology:

Information Technology > Artificial Intelligence > Robots > Locomotion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

CognitiveDog: Large Multimodal Model Based System to Translate Vision and Language into Action of Quadruped Robot

Lykov, Artem, Litvinov, Mikhail, Konenkov, Mikhail, Prochii, Rinat, Burtsev, Nikita, Abdulkarim, Ali Alridha, Bazhenov, Artem, Berman, Vladimir, Tsetserukou, Dzmitry

arXiv.org Artificial Intelligence, Jan-17-2024

This paper introduces CognitiveDog, a pioneering development of a quadruped robot with a Large Multi-modal Model (LMM) that is capable of not only communicating with humans verbally but also physically interacting with the environment through object manipulation. The system was realized on a Unitree Go1 robot dog equipped with a custom gripper and demonstrated autonomous decision-making capabilities, independently determining the most appropriate actions and interactions with various objects to fulfill user-defined tasks. These tasks do not necessarily include direct instructions, challenging the robot to comprehend and execute them based on natural language input and environmental cues. The paper delves into the intricacies of this system, dataset characteristics, and the software architecture. Key to this development is the robot's proficiency in navigating space using Visual-SLAM, effectively manipulating and transporting objects, and providing insightful natural language commentary during task execution. Experimental results highlight the robot's advanced task comprehension and adaptability, underscoring its potential in real-world applications. The dataset used to fine-tune the robot-dog behavior generation model is provided at the following link: huggingface.co/datasets/ArtemLykov/CognitiveDog_dataset

experiment, robot, skolkovo institute, (14 more...)

2401.09388

Country: North America > United States > Colorado > Boulder County > Boulder (0.06)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.98)

LLM-MARS: Large Language Model for Behavior Tree Generation and NLP-enhanced Dialogue in Multi-Agent Robot Systems

Lykov, Artem, Dronova, Maria, Naglov, Nikolay, Litvinov, Mikhail, Satsevich, Sergei, Bazhenov, Artem, Berman, Vladimir, Shcherbak, Aleksei, Tsetserukou, Dzmitry

arXiv.org Artificial Intelligence, Dec-14-2023

This paper introduces LLM-MARS, the first technology that utilizes a Large Language Model based Artificial Intelligence for Multi-Agent Robot Systems. LLM-MARS enables dynamic dialogues between humans and robots, allowing the latter to generate behavior based on operator commands and provide informative answers to questions about their actions. LLM-MARS is built on a transformer-based Large Language Model, fine-tuned from the Falcon 7B model. We employ a multimodal approach using LoRA adapters for different tasks. The first LoRA adapter was developed by fine-tuning the base model on examples of Behavior Trees and their corresponding commands. The second LoRA adapter was developed by fine-tuning on question-answering examples. Practical trials on a multi-agent system of two robots within the Eurobot 2023 game rules demonstrate promising results. The robots achieve an average task execution accuracy of 79.28% in compound commands. With commands containing up to two tasks, accuracy exceeded 90%. Evaluation confirms that the system's answers to operators' questions exhibit high accuracy, relevance, and informativeness. LLM-MARS and similar multi-agent robotic systems hold significant potential to revolutionize logistics, enabling autonomous exploration missions and advancing Industry 5.0.

language model, llm, robot, (16 more...)

2312.09348

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Russia (0.05)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.05)
(14 more...)

Genre: Research Report > New Finding (1.00)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

GP CC-OPF: Gaussian Process based optimization tool for Chance-Constrained Optimal Power Flow

Mitrovic, Mile, Kundacina, Ognjen, Lukashevich, Aleksandr, Vorobev, Petr, Terzija, Vladimir, Maximov, Yury, Deka, Deepjyoti

arXiv.org Artificial Intelligence, Feb-16-2023

As an optimization tool, Optimal Power Flow (OPF) is typically used to solve the economic dispatch (ED) problem by finding the lowest-cost output of the controllable generators that meets the load and physical constraints of the grid. However, the OPF is a complex non-linear problem with many constraints that can be hard to solve. In addition, the rapid integration of renewable energy resources (RES) with intermittent outputs propagates uncertainty through the grid and thus leads to a higher degree of complexity in power grid operations. To take into account the impacts of uncertainty within the OPF, researchers have recently proposed several stochastic approaches such as robust optimization [1], probabilistic OPF [2], and Chance-Constrained (CC) OPF [3, 4]. Robust optimization often leads to conservative solutions, while probabilistic OPF is difficult to implement in practice. The CC-OPF requires that probabilistic constraints be satisfied up to a given acceptable violation probability, thereby balancing operating costs and security in the power grid.
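
A chance constraint of the kind described above is commonly written in the generic form below. The symbols are assumptions introduced here for illustration and are not taken from the paper: $x$ denotes the controllable decisions, $\xi$ the uncertain injections, $c$ the operating cost, $g_i$ the $i$-th grid constraint, and $\epsilon_i$ its acceptable violation probability.

```latex
\min_{x} \; c(x)
\quad \text{s.t.} \quad
\mathbb{P}_{\xi}\!\left[\, g_i(x, \xi) \le 0 \,\right] \ge 1 - \epsilon_i,
\qquad i = 1, \dots, m
```

A smaller $\epsilon_i$ tightens the constraint and typically raises cost, which is the cost-security trade-off the abstract refers to.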

artificial intelligence, machine learning, optimization problem, (15 more...)

2302.08454

Country:

North America > United States > New Mexico > Los Alamos County > Los Alamos (0.06)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.06)
Asia > Russia (0.06)
Europe > Serbia > Vojvodina > South Bačka District > Novi Sad (0.04)

Genre: Research Report (0.41)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Predicting spatial distribution of Palmer Drought Severity Index

Grabar, V., Lukashevich, A., Zaytsev, A.

arXiv.org Artificial Intelligence, Sep-1-2022

The probability of a drought for a particular region is crucial when making decisions related to agriculture. Forecasting this probability is critical for management and challenging at the same time. The prediction model should consider multiple factors with complex relationships across the region of interest and neighbouring regions. We approach this problem by presenting an end-to-end solution based on a spatio-temporal neural network. The model predicts the Palmer Drought Severity Index (PDSI) for subregions of interest. Predictions by climate models provide an additional source of knowledge for the model, leading to more accurate drought predictions. Our model has better accuracy than baseline gradient boosting solutions, with an $R^2$ score of $0.90$ compared to $0.85$ for gradient boosting. We pay particular attention to the range of applicability of the model. We examine various regions across the globe to validate it under different conditions. We complement the results with an analysis of how future climate changes under different scenarios affect the PDSI and how our model can help to make better decisions and support more sustainable economics.
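
For reference, the $R^2$ score is usually the coefficient of determination, defined below. The abstract does not say which variant was used, so this standard form is an assumption; $y_i$ are the observed values, $\hat{y}_i$ the predictions, and $\bar{y}$ the mean of the observations.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

Under this definition, a score of $1$ means perfect prediction and $0$ means no improvement over always predicting $\bar{y}$.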

palmer drought severity index, prediction, spatial distribution, (12 more...)

2208.14833

Country:

North America > United States > Iowa (0.05)
Europe > Russia (0.04)
Europe > Poland (0.04)
Asia > Russia (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.33)
